Begin at the beginning: predicting genes with 5' UTRs.
Abstract
The retrainable, comparative gene predictor N-SCAN integrates multigenome modeling and 5' untranslated region (5' UTR) modeling. In this article, we evaluate N-SCAN's transcription-start site (TSS) and first exon predictions both computationally and experimentally. The computational results indicate that N-SCAN is more accurate than any of the other tools we tested at predicting the TSS and the complete first exon. It is the only one of these tools that can predict complete gene structures together with 5' UTRs. Experimental evaluation shows that N-SCAN can be used to validate novel UTR introns in human gene predictions that do not overlap any RefSeq gene and even to correct RefSeq mRNAs by adding validated UTR exons that are missing from RefSeq.
Similar resources
Deep learning of the regulatory grammar of yeast 5' untranslated regions from 500,000 random sequences.
Our ability to predict protein expression from DNA sequence alone remains poor, reflecting our limited understanding of cis-regulatory grammar and hampering the design of engineered genes for synthetic biology applications. Here, we generate a model that predicts the protein expression of the 5' untranslated region (UTR) of mRNAs in the yeast Saccharomyces cerevisiae. We constructed a library o...
Evolutionary conservation suggests a regulatory function of AUG triplets in 5′-UTRs of eukaryotic genes
By comparing sequences of human, mouse and rat orthologous genes, we show that in 5'-untranslated regions (5'-UTRs) of mammalian cDNAs but not in 3'-UTRs or coding sequences, AUG is conserved to a significantly greater extent than any of the other 63 nt triplets. This effect is likely to reflect, primarily, bona fide evolutionary conservation, rather than cDNA annotation artifacts, because the ...
Folding Free Energies of 5′-UTRs Impact Post-Transcriptional Regulation on a Genomic Scale in Yeast
Using high-throughput technologies, abundances and other features of genes and proteins have been measured on a genome-wide scale in Saccharomyces cerevisiae. In contrast, secondary structure in 5'-untranslated regions (UTRs) of mRNA has only been investigated for a limited number of genes. Here, the aim is to study genome-wide regulatory effects of mRNA 5'-UTR folding free energies. We perform...
G-quadruplexes: the beginning and end of UTRs
Molecular mechanisms that regulate gene expression can occur either before or after transcription. The information for post-transcriptional regulation can lie within the sequence or structure of the RNA transcript and it has been proposed that G-quadruplex nucleic acid sequence motifs may regulate translation as well as transcription. Here, we have explored the incidence of G-quadruplex motifs ...
Intron size, abundance, and distribution within untranslated regions of genes.
Most research concerning the evolution of introns has largely considered introns within coding sequences (CDSs), without regard for introns located within untranslated regions (UTRs) of genes. Here, we directly determined intron size, abundance, and distribution in UTRs of genes using full-length cDNA libraries and complete genome sequences for four species, Arabidopsis thaliana, Drosophila mel...
Journal:
- Genome Research
Volume 15, Issue 5
Pages: -
Publication date: 2005